Abstract

Separability of a multivariate function alleviates the difficulty of
finding its minimum or maximum value, because an optimal solution can
then be found by solving several disjoint problems of lower
dimensionality. In most practical problems, however, the function to be
optimized is a black box and its separability is hardly known in
advance. In this study, we first describe a general separability
condition that a function defined over an arbitrary domain satisfies if
and only if it is separable with respect to given disjoint subsets of
variables. By introducing an alternative separability condition, we
propose a Monte Carlo-based algorithm to estimate the separability of a
function defined over the unit cube with respect to given disjoint
subsets of variables. Moreover, we extend our algorithm to estimate the
number of disjoint subsets, and the disjoint subsets themselves, with
respect to which a function is separable. The computational complexity
of our extended algorithm is function-dependent and varies from linear
to exponential in the dimension.



1 Introduction 

Whether a given multivariate function is separable or not is one of the
important measures of the difficulty in optimization. This can be easily
understood through the following argument. Let $f(x)$ be a function of $s$
variables, i.e., $x = (x_1, \ldots, x_s)$, and let $x_u = (x_j)_{j \in u}$ be a
subset of variables for $u \subseteq [1:s]$ $(:= \{1, \ldots, s\})$. If $f(x)$ is
separable with respect to some $x_u$ and its complement
$x_{-u} := x_{[1:s] \setminus u}$ with $\emptyset \neq u \subsetneq [1:s]$, that is,
$f(x) = f_1(x_u) + f_2(x_{-u})$, we can reduce one high-dimensional
optimization problem to two disjoint optimization problems of lower
dimensionality. The values of $x_{-u}$ can be fixed while searching for an
optimal solution of $f_1(x_u)$, and vice versa. If $f_1(x_u)$ and $f_2(x_{-u})$
are further separable with respect to some subsets $x_v$ and $x_w$ with
$\emptyset \neq v \subsetneq u$ and $\emptyset \neq w \subsetneq -u$, respectively,
for instance, we can reduce to four disjoint optimization problems with
even lower dimensionalities. As an extreme case, $f(x)$ might be expressed
simply as a sum of $s$ one-dimensional functions, i.e.,
$f(x) = \sum_{j=1}^{s} f_j(x_j)$. Then, the $s$-dimensional optimization
problem can be decomposed into $s$ one-dimensional ones. If $f(x)$ is not
separable with respect to any subset of variables, on the other hand, we
have to search the whole $s$-dimensional space all at once.
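
The reduction described above can be illustrated with a small sketch. The functions `f1` and `f2` and the grid resolution are hypothetical choices made only for this illustration; they do not come from the paper. A joint grid search over three variables is compared with two separate searches over $x_u = (x_1, x_2)$ and $x_{-u} = (x_3)$.

```python
import itertools

def f1(a, b):
    # Hypothetical term depending only on x_u = (x1, x2).
    return (a - 0.25) ** 2 + (b - 0.75) ** 2 + a * b

def f2(c):
    # Hypothetical term depending only on x_{-u} = (x3,).
    return (c - 0.5) ** 2

def f(x):
    # Separable objective f(x) = f1(x_u) + f2(x_{-u}).
    return f1(x[0], x[1]) + f2(x[2])

grid = [i / 20 for i in range(21)]  # 21 grid points per variable

# Joint search over the full three-dimensional grid: 21**3 evaluations.
joint = min(itertools.product(grid, repeat=3), key=f)

# Separate searches: 21**2 evaluations for f1 plus 21 evaluations for f2.
best_u = min(itertools.product(grid, repeat=2), key=lambda p: f1(*p))
best_rest = min(grid, key=f2)
split = best_u + (best_rest,)
```

In this example both searches return the same grid point, while the separate searches need 462 evaluations instead of 9261.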

The performance of optimization algorithms, especially of heuristics and
meta-heuristics, often depends on the separability of the function. For
instance, as discussed in [9], the performance of the genetic algorithm
deteriorates if we rotate the coordinates of separable benchmark
functions, which makes the functions non-separable. Thus, in order to
cover a wide class of functions, a set of benchmark functions for
comparing different optimization algorithms is generally composed of many
separable and non-separable functions, see, for example, [3, 6]. What
matters in many practical problems, however, is that the function to be
optimized is a black box, so that its separability is hardly known a
priori. If the function is separable, we cannot exploit the advantage of
the algorithms which perform better for non-separable functions. If the
function is non-separable, on the other hand, we should avoid using the
algorithms which perform well only for separable functions. Therefore, we
can claim that the separability of the function to be optimized is one of
the central issues in choosing a suitable optimization algorithm.

Motivated by the above concern, we investigate the separability of mul- 
tivariate functions in this study. Our approach is based on the functional 
decompositions given in the literature, see for example [2, 4, 8, 12]. These 
decompositions were recently generalized by Kuo et al. [5]. After intro- 
ducing the preliminaries on those decompositions in the next section, we 
first derive a general separability condition which a function defined on an 
arbitrary domain must satisfy if and only if that function is separable with 
respect to given disjoint subsets of variables in Section 3. As special cases, 
it includes the conditions for a function to be separable with respect to one 
subset of variables and its complement, or to be separable with respect to 
all the variables. In order to construct a computable algorithm to estimate
the separability, we derive an alternative separability condition in
Section 4, which is valid for functions in $L^2([0,1]^s)$. Using this
alternative condition, we propose a Monte Carlo-based algorithm for the
separability estimation. Moreover, we extend our proposed algorithm to
estimate the number of disjoint subsets, and the disjoint subsets
themselves, with respect to which a function is separable. We show that
the computational complexity of our extended algorithm is
function-dependent and varies from linear to exponential in the dimension.
2 Background and notation



2.1 General decomposition formula 

In the following, we always write $[1:s] = \{1, \ldots, s\}$. For a given
subset $u$ of $[1:s]$, we denote by $-u$ the complement of $u$, that is,
$-u = [1:s] \setminus u$, and denote by $|u|$ the cardinality of $u$. Now we
consider a decomposition of a function of $s$ variables $f(x) \in F$, where
$F$ is a linear space of real functions defined on a domain
$D \subseteq \mathbb{R}^s$, into the following form

\[ f(x) = \sum_{u \subseteq [1:s]} f_u(x). \]

We note that the right-hand side consists of $2^s$ terms with each term
$f_u(x)$ depending only on the subset of variables $x_u$. According to
[5, Theorem 2.1], $f_u(x)$ can be generally expressed as

\[ f_u(x) := \left( \prod_{j \in u} (I - P_j) \right) P_{-u}(f)(x), \tag{1} \]

where $\{P_j : j = 1, \ldots, s\}$ is a set of commuting projections on $F$
defined on a domain $D \subseteq \mathbb{R}^s$ such that $P_j(f)(x) = f(x)$ if
$f(x)$ does not depend on $x_j$ and that $P_j(f)(x)$ does not depend on
$x_j$. Further, we define $P_u = \prod_{j \in u} P_j$ for $u \subseteq [1:s]$
and denote by $I$ the identity operator. We can rewrite (1) into the
following recursive relation

\[ f_u(x) := P_{-u}(f)(x) - \sum_{v \subsetneq u} f_v(x), \tag{2} \]

where, for $u = \emptyset$, we define

\[ f_\emptyset(x) := P_{[1:s]}(f)(x). \]

Since $f_\emptyset(x)$ is a constant, we simply write $f_\emptyset$ in the
following.

We show two important examples of $P_j$. One is called the anchored
decomposition, see, for example, [8, 14], which fixes $x_j$ at $t_j$

\[ P_j(f)(x) = f(x_1, \ldots, x_{j-1}, t_j, x_{j+1}, \ldots, x_s), \]

where the anchor $t = (t_1, \ldots, t_s)$ lies in $D$. The other, with
$D = [0,1]^s$, is called the analysis of variance (ANOVA) decomposition,
see, for example, [2, 4, 12], which integrates out $x_j$

\[ P_j(f)(x) = \int_0^1 f(x_1, \ldots, x_{j-1}, t_j, x_{j+1}, \ldots, x_s)\, dt_j. \tag{3} \]

The latter has often been used in the context of global sensitivity
analysis, which measures the relative importance of each subset of
variables on the variation of a function, see, for example,
[1, 11, 12, 13]. Since we also use this decomposition in this study, the
next subsection is devoted to explaining it in more detail.
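
As an illustration of the recursive relation (2), the following minimal sketch computes the terms $f_u(x)$ of the anchored decomposition at a single point. The test function `f`, the point `x`, the anchor `t`, the helper names and the 0-based indexing are all assumptions made for this example, not part of the original text.

```python
from itertools import combinations

def subsets(u):
    """All subsets of the tuple u, returned as tuples."""
    return [c for k in range(len(u) + 1) for c in combinations(u, k)]

def anchored_projection(f, x, t, keep):
    """P_{-u}(f)(x) with u = keep: coordinates outside `keep` are set to the anchor t."""
    y = [x[j] if j in keep else t[j] for j in range(len(x))]
    return f(y)

def anchored_term(f, x, t, u):
    """f_u(x) via (2): P_{-u}(f)(x) minus the terms f_v(x) over proper subsets v of u."""
    lower = sum(anchored_term(f, x, t, v) for v in subsets(u) if v != u)
    return anchored_projection(f, x, t, u) - lower

def f(x):
    # Hypothetical test function with one interaction (x0, x1) and one additive term.
    return x[0] * x[1] + x[2] ** 2

x = [0.3, 0.8, 0.6]
t = [0.5, 0.5, 0.5]   # anchor
terms = {u: anchored_term(f, x, t, u) for u in subsets((0, 1, 2))}
total = sum(terms.values())
```

The $2^3$ terms sum back to $f(x)$, and the term for the pair $(x_0, x_2)$ vanishes because these two variables do not interact in `f`. The plain recursion is exponential in $|u|$ and is meant only for small examples.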






2.2 ANOVA decomposition and Sobol' indices 

For any square integrable function $f(x) \in L^2([0,1]^s)$, each term
$f_u(x)$ can be obtained by using (2) and (3) as

\[ f_u(x) = \int_{[0,1]^{s-|u|}} f(x)\, dx_{-u} - \sum_{v \subsetneq u} f_v(x), \]

where, for $u = \emptyset$, we have

\[ f_\emptyset = \int_{[0,1]^s} f(x)\, dx, \]

which is simply the expectation of $f(x)$. This decomposition satisfies the
following important properties

\[ \int_0^1 f_u(x)\, dx_j = 0, \]

for $j \in u$ with $|u| > 0$, and

\[ \int_{[0,1]^s} f_u(x) f_v(x)\, dx = 0, \]

if $u \neq v$. The former can be proved by induction on $|u|$. The latter
immediately follows from the former by considering the integration with
respect to $x_j$ for any $j \in (u \cup v) \setminus (u \cap v)$. Using this
decomposition and its properties, the variance of $f(x)$, which will be
denoted by $\sigma^2$, can be expressed as

\[ \sigma^2 = \int_{[0,1]^s} f(x)^2\, dx - \left( \int_{[0,1]^s} f(x)\, dx \right)^2
   = \sum_{\emptyset \neq u \subseteq [1:s]} \int_{[0,1]^s} f_u(x)^2\, dx
   = \sum_{\emptyset \neq u \subseteq [1:s]} \sigma_u^2, \]

where we have defined

\[ \sigma_u^2 = \int_{[0,1]^s} f_u^2(x)\, dx. \]

This equality implies that a subset of variables $x_u$ with larger
$\sigma_u^2$ contributes more to the variance of the function. In other
words, the function $f(x)$ is more sensitive to changes in the values of
$x_u$ with larger $\sigma_u^2$. That is why the ANOVA decomposition plays a
central role in global sensitivity analysis.
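Sobol' indices were first introduced by Sobol' [12] and have recently been
generalized by Owen [7] to measure the relative importance of a subset of
variables. For $\emptyset \neq u \subseteq [1:s]$, let us define

\[ \underline{\tau}_u^2 = \sum_{\emptyset \neq v \subseteq u} \sigma_v^2 \]

and

\[ \overline{\tau}_u^2 = \sum_{v \cap u \neq \emptyset} \sigma_v^2. \]

Here, $\underline{\tau}_u^2$ is a sum of $\sigma_v^2$ for $v$ contained in $u$,
while $\overline{\tau}_u^2$ is a sum of $\sigma_v^2$ for $v$ which touches $u$.
It is obvious that we have
$0 \le \underline{\tau}_u^2 \le \overline{\tau}_u^2 \le \sigma^2$. We often
normalize these quantities such as $\underline{\tau}_u^2 / \sigma^2$ and
$\overline{\tau}_u^2 / \sigma^2$. From the definition, we have the following
identity

\[ \underline{\tau}_u^2 + \overline{\tau}_{-u}^2 = \sigma^2. \]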
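
As a worked illustration that is not part of the original text, consider the hypothetical example $f(x_1, x_2) = x_1 x_2$ on $[0,1]^2$. Writing $\underline{\tau}_u^2$ for the sum of the variance components $\sigma_v^2$ over nonempty $v$ contained in $u$, and $\overline{\tau}_u^2$ for the sum over $v$ touching $u$, the ANOVA terms, variance components and Sobol' indices can be computed by hand:

```latex
\begin{align*}
f_\emptyset &= \tfrac14, \quad
f_{\{1\}}(x) = \tfrac12\left(x_1 - \tfrac12\right), \quad
f_{\{2\}}(x) = \tfrac12\left(x_2 - \tfrac12\right), \quad
f_{\{1,2\}}(x) = \left(x_1 - \tfrac12\right)\left(x_2 - \tfrac12\right), \\
\sigma_{\{1\}}^2 &= \sigma_{\{2\}}^2 = \tfrac14 \cdot \tfrac1{12} = \tfrac1{48}, \quad
\sigma_{\{1,2\}}^2 = \tfrac1{12} \cdot \tfrac1{12} = \tfrac1{144}, \quad
\sigma^2 = \tfrac19 - \tfrac1{16} = \tfrac7{144}, \\
\underline{\tau}_{\{1\}}^2 &= \tfrac1{48}, \quad
\overline{\tau}_{\{1\}}^2 = \tfrac1{48} + \tfrac1{144} = \tfrac1{36}, \quad
\underline{\tau}_{\{1\}}^2 + \overline{\tau}_{\{2\}}^2
  = \tfrac3{144} + \tfrac4{144} = \sigma^2.
\end{align*}
```

The three variance components add up to $\sigma^2 = 7/144$, and the lower index of $\{1\}$ together with the upper index of its complement $\{2\}$ recovers the full variance.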



3 General separability condition 

In this section, we introduce a general separability condition, which must
be satisfied for any separable function $f(x) \in F$ with respect to given
$m$ disjoint subsets of variables $x_{u_1}, \ldots, x_{u_m}$, where
$x_{u_j} = (x_i)_{i \in u_j}$. Here we mean by $m$ disjoint subsets that a
set $\{u_1, \ldots, u_m\}$ satisfies the following properties:
$u_j \neq \emptyset$ for $j = 1, \ldots, m$,

\[ u_i \cap u_j = \emptyset, \]

if $i \neq j$, and

\[ \bigcup_{j=1}^{m} u_j = [1:s]. \]

Then the following theorem gives a general separability condition.

Theorem 1 For $m, s \in \mathbb{N}$ such that $m \le s$, let $u_1, \ldots, u_m$
be $m$ disjoint subsets of $[1:s]$. A function $f(x) \in F$ is separable with
respect to $x_{u_1}, \ldots, x_{u_m}$ if and only if the following equation
holds for any $x \in D$

\[ \left( \prod_{j=1}^{m} (I - P_{-u_j}) \right) (f)(x) = 0. \tag{4} \]
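In order to prove Theorem 1, we need the following lemma.

Lemma 1 For $m, s \in \mathbb{N}$ such that $m \le s$, let $u_1, \ldots, u_m$ be
$m$ disjoint subsets of $[1:s]$. Then, we have

\[ \prod_{j=1}^{m} (I - P_{-u_j}) = I + (m-1) P_{[1:s]} - \sum_{j=1}^{m} P_{-u_j}. \]

Proof. We note that $P_{-u_i} P_{-u_j} = P_{[1:s]}$ for $i \neq j$ since $u_i$
and $u_j$ are disjoint with each other. By using this fact and the following
identity, in which $-v = [1:m] \setminus v$,

\[ \prod_{j=1}^{m} (a_j + b_j)
   = \sum_{v \subseteq [1:m]} \left( \prod_{j \in -v} a_j \right) \left( \prod_{j \in v} b_j \right), \]

we have

\begin{align*}
\prod_{j=1}^{m} (I - P_{-u_j})
  &= \sum_{v \subseteq [1:m]} \left( \prod_{j \in -v} I \right) \left( \prod_{j \in v} (-P_{-u_j}) \right) \\
  &= \sum_{v \subseteq [1:m]} (-1)^{|v|} \prod_{j \in v} P_{-u_j} \\
  &= I - \sum_{j=1}^{m} P_{-u_j}
     + \sum_{\substack{v \subseteq [1:m] \\ |v| \ge 2}} (-1)^{|v|} \prod_{j \in v} P_{-u_j} \\
  &= I - \sum_{j=1}^{m} P_{-u_j}
     + \left( \sum_{\substack{v \subseteq [1:m] \\ |v| \ge 2}} (-1)^{|v|} \right) P_{[1:s]}.
\end{align*}

In the last term, we have

\[ \sum_{\substack{v \subseteq [1:m] \\ |v| \ge 2}} (-1)^{|v|}
   = \sum_{v \subseteq [1:m]} (-1)^{|v|}
     - \sum_{\substack{v \subseteq [1:m] \\ |v| \le 1}} (-1)^{|v|}
   = (1-1)^m - 1 + m. \]

Thus, the result follows. □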
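
The operator identity of Lemma 1 can be checked numerically. The sketch below assumes anchored projections, for which composing $P_{-u_j}$ over $j \in v$ keeps only the coordinates in the intersection of the $u_j$; the test function `h`, the point `x`, the anchor `t` and the 0-based partition `parts` are hypothetical choices for this illustration.

```python
import math
from itertools import combinations

def project(f, x, t, keep):
    """Anchored projection: coordinates outside `keep` are replaced by the anchor t."""
    return f([x[j] if j in keep else t[j] for j in range(len(x))])

def lhs(f, x, t, parts):
    """Expand prod_j (I - P_{-u_j}) (f)(x) over all index sets v of {0, ..., m-1}."""
    s, m = len(x), len(parts)
    total = 0.0
    for k in range(m + 1):
        for v in combinations(range(m), k):
            keep = set(range(s))
            for j in v:
                keep &= set(parts[j])   # applying P_{-u_j} keeps only u_j
            total += (-1) ** k * project(f, x, t, keep)
    return total

def rhs(f, x, t, parts):
    """Right-hand side of Lemma 1: (I + (m - 1) P_{[1:s]} - sum_j P_{-u_j}) (f)(x)."""
    m = len(parts)
    single = sum(project(f, x, t, set(u)) for u in parts)
    return f(x) + (m - 1) * f(t) - single

def h(x):
    # Hypothetical non-separable test function of four variables.
    return x[0] * x[1] * x[2] + math.sin(x[3] + x[0])

x = [0.1, 0.7, 0.4, 0.9]
t = [0.5, 0.2, 0.3, 0.6]
parts = [(0,), (1, 2), (3,)]
left, right = lhs(h, x, t, parts), rhs(h, x, t, parts)
```

Both sides agree up to rounding even though `h` is not separable, since the lemma is a statement about the operators and not about the function.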

Now we are ready to prove Theorem 1. 

Proof. (Theorem 1) As shown in the proof of [5, Theorem 2.1],
$P_{-u}(f)(x) = \sum_{v \subseteq u} f_v(x)$. Applying this relation and
Lemma 1, we have for the left-hand side of (4)

\begin{align*}
\left( \prod_{j=1}^{m} (I - P_{-u_j}) \right) (f)(x)
  &= \left( I + (m-1) P_{[1:s]} - \sum_{j=1}^{m} P_{-u_j} \right) (f)(x) \\
  &= f(x) + (m-1) f_\emptyset - \sum_{j=1}^{m} \sum_{v_j \subseteq u_j} f_{v_j}(x) \\
  &= f(x) - \left( f_\emptyset + \sum_{j=1}^{m} \sum_{\emptyset \neq v_j \subseteq u_j} f_{v_j}(x) \right).
\end{align*}

Given that this equals zero for any $x \in D$, we can rewrite (4) into

\[ f(x) = f_\emptyset + \sum_{j=1}^{m} \sum_{\emptyset \neq v_j \subseteq u_j} f_{v_j}(x). \]

Since $f_\emptyset$ is a constant and $u_1, \ldots, u_m$ are disjoint with each
other, this equation implies that $f(x)$ is separable with respect to
$x_{u_1}, \ldots, x_{u_m}$. The proof of the reverse direction is trivial.
Hence, the result follows. □

Our general separability condition (4) consists only of the function $f(x)$
and the projections $(P_u)_{u \subseteq [1:s]}$ and does not include any
representation $(f_u(x))_{u \subseteq [1:s]}$. We emphasize here that the
condition (4) is not equivalent to $(I - P_{-u_j})(f)(x) = 0$ for at least
one $j$ with $1 \le j \le m$, which only gives

\[ f(x) = \sum_{v \subseteq u_j} f_v(x). \]

Thus, $(I - P_{-u_j})(f)(x) = 0$ for some $j$ is just a sufficient condition
for $f(x)$ to be separable with respect to $x_{u_1}, \ldots, x_{u_m}$. In the
following, we describe the separability conditions for two special cases,
both of which are important in practice.

Corollary 1 A function $f(x) \in F$ is separable with respect to $x_u$ and
$x_{-u}$ if and only if the following equation holds for any $x \in D$

\[ \left( I + P_{[1:s]} - P_u - P_{-u} \right) (f)(x) = 0. \]

It immediately follows by inserting $m = 2$, $u_1 = u$ and $u_2 = -u$ into (4)
and by applying Lemma 1.

Corollary 2 A function $f(x) \in F$ is separable with respect to all the
variables if and only if the following equation holds for any $x \in D$

\[ \left( I + (s-1) P_{[1:s]} - \sum_{j=1}^{s} P_{-\{j\}} \right) (f)(x) = 0. \]

It also immediately follows by inserting $m = s$ and $u_j = \{j\}$ for
$j = 1, \ldots, s$ into (4) and by applying Lemma 1.
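
To make the two corollaries concrete, the following sketch evaluates their left-hand sides with anchored projections at a few random points. The function `g`, the anchor `t`, the helper names and the 0-based indexing are assumptions for this illustration. Evaluating at finitely many points can only refute separability, since the conditions must hold for every $x \in D$.

```python
import random

def anchored(f, x, t, keep):
    """Anchored projection: coordinates outside `keep` are replaced by the anchor t."""
    return f([x[j] if j in keep else t[j] for j in range(len(x))])

def corollary1_residual(f, x, t, u):
    """(I + P_{[1:s]} - P_u - P_{-u})(f)(x) with anchored projections."""
    rest = set(range(len(x))) - set(u)
    # P_u anchors the coordinates in u (keeps the rest); P_{-u} keeps u.
    return f(x) + f(t) - anchored(f, x, t, rest) - anchored(f, x, t, set(u))

def corollary2_residual(f, x, t):
    """(I + (s - 1) P_{[1:s]} - sum_j P_{-{j}})(f)(x) with anchored projections."""
    s = len(x)
    return f(x) + (s - 1) * f(t) - sum(anchored(f, x, t, {j}) for j in range(s))

def g(x):
    # Separable with respect to (x0, x1) and x2, but not with respect to all variables.
    return x[0] * x[1] + x[2] ** 2

rng = random.Random(0)
t = [0.5, 0.5, 0.5]
points = [[rng.random() for _ in range(3)] for _ in range(5)]
res1 = [corollary1_residual(g, x, t, (0, 1)) for x in points]
res2 = [corollary2_residual(g, x, t) for x in points]
```

For this `g`, the residuals of Corollary 1 with $u = \{x_0, x_1\}$ vanish up to rounding, while the residuals of Corollary 2 equal $(x_0 - 1/2)(x_1 - 1/2)$ and are generally nonzero.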
4 Separability estimation of multivariate functions



In the previous section, we have shown the general separability condition,
which is a necessary and sufficient condition for $f(x)$ to be separable
with respect to given disjoint subsets of variables. It is quite difficult,
however, to confirm whether a given black-box function $f(x)$ satisfies this
condition or not. Hence, in this section, we propose a computational
algorithm based on the Monte Carlo method to estimate the separability of
$f(x)$. The key ingredient lies in the use of the ANOVA decomposition and
Sobol' indices. We need to restrict ourselves to $f(x) \in L^2([0,1]^s)$; in
many practical problems, however, $D \subseteq \mathbb{R}^s$ can be replaced
by $[0,1]^s$ using a suitable transformation of variables, and $f(x)$
satisfies this restriction.
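
One simple transformation of this kind, assuming that $D$ is an axis-aligned box, is a componentwise affine rescaling. The helper `to_unit_cube` and the objective `g` below are hypothetical names introduced for this sketch; because each variable is rescaled on its own, the separability structure of the function is unchanged.

```python
def to_unit_cube(g, lower, upper):
    """Return f on [0, 1]^s with f(x) = g(lower + (upper - lower) * x), componentwise."""
    def f(x):
        y = [lo + (hi - lo) * xi for lo, hi, xi in zip(lower, upper, x)]
        return g(y)
    return f

def g(y):
    # Hypothetical objective defined on the box [-5, 5] x [0, 10].
    return y[0] ** 2 + y[1]

f = to_unit_cube(g, lower=[-5.0, 0.0], upper=[5.0, 10.0])
value = f([0.5, 0.2])   # evaluates g at the point (0, 2)
```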

The following theorem shows an alternative separability condition for
$f(x) \in L^2([0,1]^s)$, which will be used later in proposing a
computational algorithm to estimate the separability of $f(x)$.

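Theorem 2 For $m, s \in \mathbb{N}$ such that $m \le s$, let $u_1, \ldots, u_m$
be $m$ disjoint subsets of $[1:s]$. A function $f(x) \in L^2([0,1]^s)$ is
separable with respect to $x_{u_1}, \ldots, x_{u_m}$ if and only if the
following equation holds

\[ \sum_{j=1}^{m} \underline{\tau}_{u_j}^2 = \sigma^2. \tag{5} \]

Proof. From the definition of $\underline{\tau}_u^2$, it is possible to
rewrite (5) into

\[ \sum_{j=1}^{m} \sum_{\emptyset \neq v_j \subseteq u_j} \sigma_{v_j}^2
   = \sum_{\emptyset \neq v \subseteq [1:s]} \sigma_v^2. \]

This equation implies that for any nonempty subset $v$ which is not a subset
of $u_j$ for $j = 1, \ldots, m$, we have $\sigma_v^2 = 0$ and thus
$f_v(x) \equiv 0$. Therefore, $f(x)$ can be expressed as

\[ f(x) = f_\emptyset + \sum_{j=1}^{m} \sum_{\emptyset \neq v_j \subseteq u_j} f_{v_j}(x). \]

The proof of the reverse direction is trivial. Hence, the result follows. □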
Now we introduce the following notation. 

Definition 1 For $m, s \in \mathbb{N}$ such that $m \le s$, let
$u_1, \ldots, u_m$ be $m$ disjoint subsets of $[1:s]$. We define a
separability index with respect to $u_1, \ldots, u_m$, which is denoted by
$\gamma^2_{u_1,\ldots,u_m}$, as follows.

\[ \gamma^2_{u_1,\ldots,u_m} := \sigma^2 - \sum_{j=1}^{m} \underline{\tau}_{u_j}^2. \]
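
As a small worked illustration that is not part of the original text, take $s = 2$ and the partition $u_1 = \{1\}$, $u_2 = \{2\}$, and read the separability index as the variance $\sigma^2$ minus the sum of the lower Sobol' indices $\underline{\tau}_{u_j}^2$. The two functions below are hypothetical examples on $[0,1]^2$:

```latex
\begin{align*}
f(x) = x_1 x_2: &\quad
\sigma^2 = \tfrac7{144}, \quad
\underline{\tau}_{\{1\}}^2 = \underline{\tau}_{\{2\}}^2 = \tfrac1{48}, \quad
\gamma^2_{\{1\},\{2\}} = \tfrac7{144} - \tfrac1{48} - \tfrac1{48}
  = \tfrac1{144} = \sigma_{\{1,2\}}^2, \\
g(x) = x_1 + x_2: &\quad
\sigma^2 = \tfrac16, \quad
\underline{\tau}_{\{1\}}^2 = \underline{\tau}_{\{2\}}^2 = \tfrac1{12}, \quad
\gamma^2_{\{1\},\{2\}} = \tfrac16 - \tfrac1{12} - \tfrac1{12} = 0.
\end{align*}
```

The index is zero for the additive function and equals the variance of the interaction term for the product, in line with Theorem 2.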
It is trivial from the definition that $\gamma^2_{u_1,\ldots,u_m}$ ranges
from $0$ to $\sigma^2$. Further, we emphasize that the condition
$\gamma^2_{u_1,\ldots,u_m} = 0$ can be substituted for the condition
$\sum_{j=1}^{m} \underline{\tau}_{u_j}^2 = \sigma^2$ in Theorem 2. Our goal is
to construct an algorithm which estimates $\gamma^2_{u_1,\ldots,u_m}$ of a
black-box function $f(x)$ computationally. In order to obtain a computable
form for the estimation of $\gamma^2_{u_1,\ldots,u_m}$, we use the integral
form of $\underline{\tau}_u^2$, see, for example, [7, 10]

\[ \underline{\tau}_u^2 = \int_{[0,1]^{2s}} f(x) \left( f(x_u, z_{-u}) - f(z) \right) dx\, dz, \]

and that of $\sigma^2$

\[ \sigma^2 = \int_{[0,1]^{2s}} f(x) \left( f(x) - f(z) \right) dx\, dz, \]

where $x$ and $z$ are independent and identically distributed in $[0,1]^s$,
and the $s$-dimensional vector $(x_u, z_{-u})$ denotes
$y = (y_1, \ldots, y_s)$ in which $y_j = x_j$ for $j \in u$ and $y_j = z_j$ for
$j \in -u$. Then, we have the following form of $\gamma^2_{u_1,\ldots,u_m}$.

\begin{align*}
\gamma^2_{u_1,\ldots,u_m}
  &= \int_{[0,1]^{2s}} f(x) \left( f(x) - f(z)
     - \sum_{j=1}^{m} \left( f(x_{u_j}, z_{-u_j}) - f(z) \right) \right) dx\, dz \\
  &= \int_{[0,1]^{2s}} f(x) \left( f(x) + (m-1) f(z)
     - \sum_{j=1}^{m} f(x_{u_j}, z_{-u_j}) \right) dx\, dz.
\end{align*}

Since the integral can be approximated by using the Monte Carlo method,
which averages with equal weights $n$ evaluations at random points, we
propose the following algorithm to estimate $\gamma^2_{u_1,\ldots,u_m}$.

Algorithm 1 (Estimation of $\gamma^2_{u_1,\ldots,u_m}$)

For $m, s \in \mathbb{N}$ such that $m \le s$, let $u_1, \ldots, u_m$ be $m$
disjoint subsets of $[1:s]$ and let $\gamma^2_{u_1,\ldots,u_m}$ be the
separability index as defined in Definition 1. For $n \in \mathbb{N}$, we
proceed as follows.

1. Generate $x_i, z_i \in [0,1]^s$ for $0 \le i < n$ randomly.

2. Compute the approximation of $\gamma^2_{u_1,\ldots,u_m}$

\[ \hat{\gamma}^2_{u_1,\ldots,u_m} = \frac{1}{n} \sum_{i=0}^{n-1} f(x_i)
   \left( f(x_i) + (m-1) f(z_i) - \sum_{j=1}^{m} f(x_{i,u_j}, z_{i,-u_j}) \right), \tag{6} \]

where $x_{i,u_j} = (x_{i,l})_{l \in u_j}$ in which $x_{i,l}$ is the $l$-th
component of $x_i$, and the same notation applies to $z_{i,-u_j}$.
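
The estimator of Algorithm 1 can be sketched in a few lines. The function name `separability_index`, its parameters, the use of Python's `random` module for the sample points and the test function `g` are assumptions of this illustration rather than the authors' implementation; subsets are given as 0-based index tuples.

```python
import random

def separability_index(f, parts, s, n, seed=0):
    """Monte Carlo estimate (6) of the separability index for the partition `parts`.

    `parts` is a list of disjoint 0-based index tuples that together cover range(s).
    """
    rng = random.Random(seed)
    m = len(parts)
    total = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(s)]
        z = [rng.random() for _ in range(s)]
        mixed = 0.0
        for u in parts:
            # y takes x on the coordinates in u and z elsewhere: (x_u, z_{-u}).
            y = [x[j] if j in u else z[j] for j in range(s)]
            mixed += f(y)
        total += f(x) * (f(x) + (m - 1) * f(z) - mixed)
    return total / n

def g(x):
    # Hypothetical test function: separable w.r.t. (x0, x1) and x2 only.
    return x[0] * x[1] + x[2] ** 2

est_blocks = separability_index(g, [(0, 1), (2,)], s=3, n=2000)
est_full = separability_index(g, [(0,), (1,), (2,)], s=3, n=20000)
```

For the correct partition the estimate is zero up to rounding, whatever the sample. For the partition into single variables it approximates the variance of the interaction between $x_0$ and $x_1$, which is $1/144$ for this `g`, with the usual Monte Carlo error. Each sample costs $m + 2$ evaluations of $f$.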
It is obvious that the computational complexity of our algorithm is
linear in $m$ and $n$. Furthermore, when $f(x)$ is separable with respect to
$x_{u_1}, \ldots, x_{u_m}$, our algorithm yields exactly zero for
$\hat{\gamma}^2_{u_1,\ldots,u_m}$ because the expression in the parentheses
of (6) is zero for any $x_i, z_i \in [0,1]^s$. In order to find the disjoint
subsets $u_1, \ldots, u_m$ such that $\gamma^2_{u_1,\ldots,u_m}$ is zero,
however, we need to try many possible candidates of $\{u_1, \ldots, u_m\}$
for $m = 2, \ldots, s$. To make a systematic search for $m$ and
$u_1, \ldots, u_m$, we use the following lemma.

Lemma 2 That $f(x)$ is separable with respect to $x_{u_1}, \ldots, x_{u_m}$ is
equivalent to that $f(x)$ is separable with respect to $x_{u_j}$ and
$x_{-u_j}$ for $j = 1, \ldots, m$.

Since this lemma is trivial, we omit the proof. This lemma implies that
it is sufficient to search, one by one, for subsets $u$ whose value of
$\gamma^2_{u,-u}$ is zero, without touching the ones already found.
Moreover, due to the symmetry of $u$ and $-u$, the overall search space of
$u$ can be reduced to $\emptyset \neq u \subseteq [1:s-1]$ and we can simply
write $\gamma^2_u := \gamma^2_{u,-u}$. Based on these observations, we
proceed with the search in the following order

u = {1},

u = {2}, {1,2},

u = {3}, {1,3}, {2,3}, {1,2,3},

...

u = {s-1}, {1,s-1}, ..., {1,...,s-1}.

If $\gamma^2_u$ turns out to be zero during this process, we can omit from
the remaining candidates every subset that touches at least one element of
$u$.
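
The search order can be written as a small generator. The name `candidates` and the ordering within each row (by subset size, then lexicographically) are assumptions of this sketch; the text fixes only the first rows explicitly. Variables are numbered from 1 as in the text.

```python
from itertools import combinations

def candidates(s, found=()):
    """Yield candidate subsets u of {1, ..., s-1} row by row, skipping every
    subset that touches a variable listed in `found`."""
    blocked = set(found)
    for r in range(1, s):
        if r in blocked:
            continue
        free = [j for j in range(1, r) if j not in blocked]
        for k in range(len(free) + 1):
            for v in combinations(free, k):
                yield v + (r,)

order_s4 = list(candidates(4))
after_1_found = list(candidates(5, found=(1,)))
```

With `s = 4` this reproduces the rows listed above, and with `found=(1,)` every candidate containing variable 1 is dropped.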

For example, if $s = 5$ and $f(x)$ is separable with respect to
$x_{\{1\}}, x_{\{2,4\}}, x_{\{3,5\}}$, we proceed with the search as follows.

u = {1}*,

u = {2},

u = {3}, {2,3},

u = {4}, {2,4}*,

where * means that the corresponding subset of variables is found to be
separable. Consequently, we obtain $u_1 = \{1\}$, $u_2 = \{2,4\}$. From
Lemma 2, we have $m = 3$ and $u_3 = \{3,5\}$.

Hence, our extended algorithm to estimate the number of disjoint subsets
$m$ and the disjoint subsets themselves $u_1, \ldots, u_m$ is given as
follows.

Algorithm 2 (Estimation of $m$ and $u_1, \ldots, u_m$)

For $s, n \in \mathbb{N}$, we proceed as follows.

1. Set $r = m = 1$ and generate $x_i, z_i \in [0,1]^s$ for $0 \le i < n$
randomly.

2. For each subset $v$ such that
$v \subseteq [1:r-1] \setminus \bigcup_{j=1}^{m-1} u_j$, compute
$\hat{\gamma}^2_{v \cup \{r\}}$ according to (6). If one finds $v$ such that
$\hat{\gamma}^2_{v \cup \{r\}} = 0$, set $u_m = v \cup \{r\}$ and $m = m + 1$.

3. Set $r = r + 1$. If $r < s$, go to step 2.
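
A minimal sketch of this search is given below. Several details are assumptions of the illustration and not statements of the paper: the function names, 0-based indices, the order in which candidates of one row are tried, the tolerance `tol` that replaces the exact test for zero in floating-point arithmetic, and the final step that collects the remaining variables into the last subset, as in the example with $s = 5$.

```python
import math
import random
from itertools import combinations

def gamma_hat(f, u, xs, zs):
    """Estimate (6) for the two-block partition {u, -u} (m = 2), 0-based indices."""
    s = len(xs[0])
    total = 0.0
    for x, z in zip(xs, zs):
        xu = [x[j] if j in u else z[j] for j in range(s)]   # (x_u, z_{-u})
        zu = [z[j] if j in u else x[j] for j in range(s)]   # (x_{-u}, z_u)
        total += f(x) * (f(x) + f(z) - f(xu) - f(zu))
    return total / len(xs)

def find_blocks(f, s, n, tol=1e-10, seed=0):
    rng = random.Random(seed)
    xs = [[rng.random() for _ in range(s)] for _ in range(n)]   # step 1
    zs = [[rng.random() for _ in range(s)] for _ in range(n)]
    blocks, used = [], set()
    for r in range(s - 1):                  # variables 1, ..., s-1 in 0-based form
        free = [j for j in range(r) if j not in used]
        hit = None
        for k in range(len(free) + 1):      # step 2: try v united with {r}
            for v in combinations(free, k):
                u = set(v) | {r}
                if abs(gamma_hat(f, u, xs, zs)) <= tol:
                    hit = u
                    break
            if hit is not None:
                break
        if hit is not None:
            blocks.append(sorted(hit))
            used |= hit
    rest = [j for j in range(s) if j not in used]
    if rest:
        blocks.append(rest)                 # remaining variables form the last subset
    return blocks

def f(x):
    # Hypothetical function, separable w.r.t. {x0}, {x1, x3}, {x2, x4}.
    return math.sin(x[0]) + x[1] * x[3] + math.exp(x[2] * x[4])

blocks = find_blocks(f, s=5, n=200)
```

For this `f` the search visits the same candidates as the example with $s = 5$ and returns the three subsets in 0-based form.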

The computational complexity of our extended algorithm is
function-dependent as follows. When $f(x)$ is separable with respect to all
the variables, our algorithm searches only $u = \{1\}, \ldots, \{s\}$ in this
order. Hence, the computational complexity is minimized and becomes linear
in $s$ and $n$. When $f(x)$ is not separable with respect to any subset of
variables, on the other hand, our algorithm searches all the candidates
$\emptyset \neq u \subsetneq [1:s]$ so that the computational complexity is
maximized. Since the number of subsets $u$ such that
$\emptyset \neq u \subsetneq [1:s]$ is $2^s - 2$, the computational
complexity remains linear in $n$ but becomes exponential in $s$.

For this reason, Algorithm 2 should work for small $s$ but becomes
infeasible as $s$ increases. How to overcome this drawback is open for
further research. For now, for large $s$, Algorithm 1 with $m = s$ and
$u_j = \{j\}$ for $j = 1, \ldots, s$ will be of use as an initial screening
to estimate the separability with respect to all the variables at one time,
which can be done with a computational complexity linear in $s$.